Bayesian Population Genomic Inference of Crossing Over and Gene Conversion

Authors

  • Badri Padhukasahasram
  • Bruce Rannala
Abstract

Meiotic recombination is a fundamental cellular mechanism in sexually reproducing organisms, and its different forms, crossing over and gene conversion, both play an important role in shaping genetic variation in populations. Here, we describe a coalescent-based full-likelihood Markov chain Monte Carlo (MCMC) method for jointly estimating the crossing-over, gene-conversion, and mean tract length parameters from population genomic data under a Bayesian framework. Although computationally more expensive than methods that use approximate likelihoods, the relative efficiency of our method is expected to be optimal in theory. Furthermore, this method also yields a posterior sample of genealogies for the data. We first check the performance of the new method on simulated data and verify its correctness. We also extend the method for inference under models with variable gene-conversion and crossing-over rates and demonstrate its ability to identify recombination hotspots. Then, we apply the method to two empirical data sets that were sequenced in the telomeric regions of the X chromosome of Drosophila melanogaster. Our results indicate that gene conversion occurs more frequently than crossing over in the su-w and su-s gene sequences, while the local rates of crossing over as inferred by our program are not low. The mean tract lengths for gene-conversion events are estimated to be ∼70 bp and 430 bp, respectively, for these data sets. Finally, we discuss ideas and optimizations for reducing the execution time of our algorithm.


Similar resources

Relative influences of crossing over and gene conversion on the pattern of linkage disequilibrium in Arabidopsis thaliana.

In this article we infer the rates of gene conversion and crossing over in Arabidopsis thaliana from population genetic data. Our data set is a genomewide survey consisting of 1347 fragments of length 600 bp sequenced in 96 accessions. It has several orders of magnitude more markers than any previous nonhuman study. This allows for more accurate inference as well as a detailed comparison betwee...


Linkage disequilibria and the site frequency spectra in the su(s) and su(w(a)) regions of the Drosophila melanogaster X chromosome.

Over the last decade, surveys of DNA sequence variation in natural populations of several Drosophila species and other taxa have established that polymorphism is reduced in genomic regions characterized by low rates of crossing over per physical length. Parallel studies have also established that divergence between species is not reduced in these same genomic regions, thus eliminating explanati...


Comparison of Bayesian methods in genomic evaluation with different genetic architectures

The aim of this study was to compare different parametric Bayesian approaches for predicting genomic breeding values of traits with different genetic architectures, varying in the distribution of gene effects, the number of quantitative trait loci, heritability, and the size of the reference population, using simulated data. A genome contained 3 chromosomes, with the length of 100 cM and 10...


Bayesian approach to inference of population structure

Methods of inferring population structure and its applications in identifying disease models, as well as in forecasting the physical and mental condition of human beings, have been gaining ever-increasing importance. In this article, first, the motivation and significance of studying the problem of population structure are explained. In the next section, the applications of inference of p...


Accuracy of Genomic Prediction under Different Genetic Architectures and Estimation Methods

The accuracy of genomic breeding value prediction was investigated at various levels of reference population size, trait heritability, and number of quantitative trait loci (QTL). Five Bayesian methods, including Bayesian Ridge regression, BayesA, BayesB, BayesC and Bayesian LASSO, were used to estimate the marker effects for each of 27 scenarios resulting from combining three levels for her...



Journal title:

Volume 189, Issue

Pages -

Publication date: 2011